Hard problems in similarity searching
Authors
Abstract
Similar resources
Hard problems in similarity searching
The Closest Substring Problem is one of the most important problems in the field of computational biology. It is stated as follows: given a set of $t$ sequences $s_1, s_2, \ldots, s_t$ over an alphabet $\Sigma$, and two integers $k, d$ with $d \le k$, can one find a string $s$ of length $k$ and, for all $i = 1, 2, \ldots, t$, substrings $o_i$ of $s_i$, all of length $k$, such that $d(s, o_i) \le d$ (for all $i = 1, 2, \ldots, t$)? (here, d(:...
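The decision problem stated above can be illustrated with a small brute-force sketch. It assumes that the distance d(·,·) is the Hamming distance, which the truncated abstract does not confirm; the function names `hamming` and `closest_substring` are hypothetical and chosen only for illustration. The search enumerates every candidate string of length k, so it is exponential in k and only suitable for toy inputs.

```python
from itertools import product


def hamming(a, b):
    """Number of positions where two equal-length strings differ (assumed distance)."""
    return sum(x != y for x, y in zip(a, b))


def closest_substring(seqs, k, d, alphabet):
    """Return (s, witnesses) if some string s of length k is within distance d
    of a length-k substring of every sequence; otherwise return None.

    Exhaustive search over all |alphabet|**k candidates: a sketch, not an
    efficient algorithm.
    """
    for cand in product(alphabet, repeat=k):
        s = "".join(cand)
        witnesses = []
        for seq in seqs:
            # All length-k substrings o_i of the current sequence s_i.
            windows = (seq[j:j + k] for j in range(len(seq) - k + 1))
            best = min(windows, key=lambda o: hamming(s, o), default=None)
            if best is None or hamming(s, best) > d:
                break  # this candidate fails for at least one sequence
            witnesses.append(best)
        else:
            return s, witnesses
    return None


if __name__ == "__main__":
    print(closest_substring(["ACGTA", "TCGAA", "GGCGT"], 3, 1, "ACGT"))
```

The returned witnesses are the substrings o_i, one per input sequence, so a caller can check the condition d(s, o_i) ≤ d directly.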
Model for Similarity Searching
The indexing algorithms and data structures for similarity searching in metric spaces seem to emerge from a great diversity, and different approaches have been proposed and analyzed separately, often under different assumptions. Currently, the only realistic way to compare two different algorithms is to apply them to the same data set. We present a unified model for studying similarity searching al...
Chemical Similarity Searching
This paper reviews the use of similarity searching in chemical databases. It begins by introducing the concept of similarity searching, differentiating it from the more common substructure searching, and then discusses the current generation of fragment-based measures that are used for searching chemical structure databases. The next sections focus upon two of the principal characteristics of a...
Multivariate Time Series Similarity Searching
Multivariate time series (MTS) datasets are very common in various financial, multimedia, and hydrological fields. In this paper, a dimension-combination method is proposed to search similar sequences for MTS. Firstly, the similarity of single-dimension series is calculated; then the overall similarity of the MTS is obtained by synthesizing each of the single-dimension similarity based on weigh...
Approximate Searching For Distributional Similarity
Distributional similarity requires large volumes of data to accurately represent infrequent words. However, the nearest-neighbour approach to finding synonyms suffers from poor scalability. The Spatial Approximation Sample Hierarchy (SASH), proposed by Houle (2003b), is a data structure for approximate nearest-neighbour queries that balances the efficiency/approximation trade-off. We have intergr...
Journal
Journal title: Discrete Applied Mathematics
Year: 2004
ISSN: 0166-218X
DOI: 10.1016/j.dam.2004.06.003